-
Notifications
You must be signed in to change notification settings - Fork 58
Support for Ollama Embedding models in Java. (ollama4j) #250
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Hi @mcolomerc, I noticed you've been updating this PR for the past week. First of all, thanks for your contribution. And I'm sorry for the late response and review on this. We've been busy with the 0.1.0 release and the Flink Forward conference. Just want you to know that I should be able to get back to the community and review this next week. |
yes, basically updated my fork and updated the poms. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi, @mcolomerc, thanks for your contribution.
The overall architecture looks good to me, I left some minor comments.
...a/org/apache/flink/agents/integrations/embeddingmodels/ollama/OllamaEmbeddingModelSetup.java
Outdated
Show resolved
Hide resolved
* @param model The specific embedding model to use | ||
* @return An array of floating-point values representing the text embeddings | ||
*/ | ||
public float[] embed(String text, String model) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be possible to define these embed methods in the parent class?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added these methods to the parent class, needed to add them as abstract methods to the base interface, forcing all embedding model implementations to support them. what do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be fine not to declare it as an abstract method, since most implementing classes may not need to modify the implementation of these methods.
examples/src/main/java/org/apache/flink/agents/examples/agents/EmbeddingsAgent.java
Outdated
Show resolved
Hide resolved
…n property in integrations/pom.xml and reference it in embedding-models/ollama and chat-models/ollama. See: apache#250
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for I forgot a comment in last review.
} | ||
|
||
@Override | ||
public float[] embed(String text) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we could add an additional parameter Map<String, Object> parameters
here, like chat
method of BaseChatModelConnection
.
/**
* Generate embeddings for a single text input.
*
* @param text The input text to generate embeddings for
* @param parameters The additional arguments to configure the embed request.
* @return An array of floating-point values representing the text embeddings
*/
public abstract float[] embed(String text, Map<String, Object> parameters);
Then we can remove the embed(String text, String model)
and get model
name from parameters
.
I recommend this implementation because, in our design, different instances of EmbeddingModelSetup
can be initialized with different configuration parameters, yet they can share the same EmbeddingModelConnection
. When an EmbeddingModelSetup
instance calls the embed method of EmbeddingModelConnection
, it can pass its own unique configuration parameters, including model_name
, encoding_format
, and others.
You can take a look at python implementation as a reference.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense. I did a refactor to include the approach and now it's more aligned with BaseChatModelConnection.
Linked issue: #144
JAVA API for Embedding models.
Ollama support fro Embedding models with ollama4j.
Purpose of change
Support for Ollama Embedding models in Java. (ollama4j)
Tests
OllamaEmbeddingModelConnectionTest
Example: WorkflowEmbeddingsAgentExampleJob provided.
API
Annotations: EmbeddingModelConnection and EmbeddingModelSetup annotations.
embedding.model: BaseEmbeddingModelConnection and BaseEmbeddingModelSetup
Documentation
Integrations: Added Ollama integration for using embedding models.